Discovery of functional dependencies in university data based on affinity propagation clustering and TANE algorithms
HUANG Yongxin, TANG Xuefei
Journal of Computer Applications, 2020, 40(1): 90-95. DOI: 10.11772/j.issn.1001-9081.2019061050
Abstract
To address missing values in datasets and the small number and low accuracy of the functional dependencies discovered during data quality checking at universities, a functional dependency discovery method for university data, APTANE, was proposed that combines Affinity Propagation (AP) clustering with the TANE algorithm. First, the Chinese-language fields in the dataset were parsed row by row, and each Chinese field value was replaced by a corresponding numerical value. Then, the AP clustering algorithm was used to fill in the missing values in the dataset. Finally, the TANE algorithm was used to automatically discover the non-trivial, minimal functional dependencies in the processed dataset. The experimental results show that, after a real university dataset is repaired with the AP clustering algorithm, the number of functional dependencies found increases to 80, compared with applying the automatic functional dependency discovery algorithm directly. The functional dependencies found after missing-value filling represent the relationships between fields more accurately, which reduces the workload of domain experts and improves the quality of data held by universities.
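To make the terms "non-trivial" and "minimal" concrete, the following is a minimal sketch of two of the steps described in the abstract: encoding text field values as integers, and searching for functional dependencies. It is not the authors' implementation and it is not TANE itself. TANE works on stripped partitions with stronger pruning, whereas this sketch uses a naive levelwise search that only skips supersets of a left-hand side already found. The AP-based filling step is not shown. The function names, the toy table, and its column names (`student_id`, `dept`, `dean`) are hypothetical.

```python
from itertools import combinations

def encode_column(values):
    """Map each distinct non-missing value to an integer code; None stays None."""
    codes = {}
    out = []
    for v in values:
        if v is None:
            out.append(None)
        else:
            # First occurrence gets the next free code; repeats reuse it.
            out.append(codes.setdefault(v, len(codes)))
    return out

def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds on rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True

def minimal_fds(rows, attrs):
    """Naive levelwise search for minimal, non-trivial FDs with one RHS attribute.

    Non-trivial: rhs is never part of lhs. Minimal: no proper subset of lhs
    already determines rhs. Empty left-hand sides are not considered.
    """
    found = []
    for rhs in attrs:
        others = [a for a in attrs if a != rhs]
        minimal_lhs = []
        for size in range(1, len(others) + 1):
            for lhs in combinations(others, size):
                # Skip supersets of a left-hand side already known to work.
                if any(set(m) <= set(lhs) for m in minimal_lhs):
                    continue
                if holds(rows, lhs, rhs):
                    minimal_lhs.append(lhs)
                    found.append((lhs, rhs))
    return found

# Hypothetical toy table, assumed complete (no missing values).
attrs = ["student_id", "dept", "dean"]
rows = [
    {"student_id": 1, "dept": "CS", "dean": "Li"},
    {"student_id": 2, "dept": "CS", "dean": "Li"},
    {"student_id": 3, "dept": "Math", "dean": "Wang"},
    {"student_id": 4, "dept": "Math", "dean": "Wang"},
]

for lhs, rhs in minimal_fds(rows, attrs):
    print(", ".join(lhs), "->", rhs)
```

On this toy table the search reports single-attribute dependencies such as `dept -> dean`, and it never reports `student_id, dean -> dept`, because `student_id -> dept` already holds and makes the larger left-hand side non-minimal. If missing values were left in as `None`, rows with a missing field would be treated as sharing one value, which can hide or invent dependencies; this is the kind of distortion that filling the missing values first is meant to reduce.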